Feature Selection by a Genetic Algorithm. Application to Seed Discrimination by Artificial Vision

Authors

  • Younes Chtioui
  • Dominique Bertrand
  • Dominique Barba
Abstract

Genetic algorithms (GAs) are efficient search methods based on the paradigm of natural selection and population genetics. A simple GA was applied for selecting the optimal feature subset among an initial feature set of larger size. The performances were tested on a practical pattern recognition problem, which consisted of the discrimination between four seed species (two cultivated and two adventitious seed species) by artificial vision. A set of 73 features, describing size, shape and texture, was extracted from colour images in order to characterise each seed. The goal of the GA was to select the best subset of features which gave the highest classification rates when using the nearest neighbour as a classification method. The selected features were represented by binary chromosomes which had 73 elements. The number of selected features was directly related to the probability of initialisation of the population at the first generation of the GA. When this probability was fixed to 0.1, the GA selected about five features. The classification performances increased with the number of generations. For example, 6.25% of the seeds were misclassified by using five features at generation 140, whereas another subset of the same size led to 3% misclassification at generation 400. The present work shows the great potential of GAs for feature selection (dimensionality reduction) problems. © 1998 SCI. J Sci Food Agric 76, 77-86 (1998)
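The scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract specifies only binary chromosomes of 73 elements, an initialisation probability (0.1) and a nearest-neighbour classifier. Everything else here is an assumption made for illustration, including the tournament selection, one-point crossover, mutation rate, elitism, population size, the leave-one-out evaluation, and the synthetic data from the hypothetical helper `make_toy_data`, which stands in for the seed images. With an initialisation probability of 0.1, a 73-gene chromosome starts with 73 × 0.1 ≈ 7.3 selected features on average.

```python
import numpy as np

def init_population(rng, pop_size, n_features, p_init):
    """Each gene is 1 (feature selected) with probability p_init."""
    return (rng.random((pop_size, n_features)) < p_init).astype(np.uint8)

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on the selected columns."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 0.0  # an empty subset cannot classify anything
    Z = X[:, cols]
    d = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)  # squared distances
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    return float((y[d.argmin(axis=1)] == y).mean())

def evolve(X, y, pop_size=30, n_generations=20, p_init=0.1, p_mut=0.01, seed=0):
    """Simple GA over binary feature masks (operators are assumptions, see lead-in)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    pop = init_population(rng, pop_size, n_features, p_init)
    best_mask, best_fit = None, -1.0
    for _ in range(n_generations):
        fit = np.array([loo_1nn_accuracy(X, y, c) for c in pop])
        i = int(fit.argmax())
        if fit[i] > best_fit:
            best_fit, best_mask = float(fit[i]), pop[i].copy()
        # Binary tournament selection: keep the fitter of two random individuals.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fit[a] >= fit[b], a, b)]
        # One-point crossover between consecutive pairs of parents.
        children = parents.copy()
        for j in range(0, pop_size - 1, 2):
            cut = int(rng.integers(1, n_features))
            children[j, cut:] = parents[j + 1, cut:]
            children[j + 1, cut:] = parents[j, cut:]
        # Bit-flip mutation, then elitism: reinsert the best mask found so far.
        flips = (rng.random(children.shape) < p_mut).astype(np.uint8)
        pop = children ^ flips
        pop[0] = best_mask
    return best_mask, best_fit

def make_toy_data(seed=1, n_per_class=20, n_features=73, n_informative=3):
    """Synthetic stand-in: 4 classes, 73 features, only the first few informative."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(4), n_per_class)
    X = rng.normal(size=(y.size, n_features))
    centres = rng.normal(scale=4.0, size=(4, n_informative))
    X[:, :n_informative] += centres[y]
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = evolve(X, y)
    print("selected features:", np.flatnonzero(mask))
    print("leave-one-out accuracy:", acc)
```

Because the fitness is the classification rate of the classifier itself, this is a wrapper-style selection: each chromosome costs one full nearest-neighbour evaluation, which is why the population size and generation count are kept small in the sketch.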

Similar resources

Feature selection using genetic algorithm for classification of schizophrenia using fMRI data

In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...

Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets

Objective(s): This study addresses feature selection for breast cancer diagnosis. The present process uses a wrapper approach using GA-based on feature selection and PS-classifier. The results of experiment show that the proposed model is comparable to the other models on Wisconsin breast cancer datasets. Materials and Methods: To evaluate effectiveness of proposed feature selection method, we ...

Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine

Different approaches have been proposed for feature selection to obtain suitable features subset among all features. These methods search feature space for feature subsets which satisfies some criteria or optimizes several objective functions. The objective functions are divided into two main groups: filter and wrapper methods.  In filter methods, features subsets are selected due to some measu...

Appraisal of the evolutionary-based methodologies in generation of artificial earthquake time histories

Through the last three decades different seismological and engineering approaches for the generation of artificial earthquakes have been proposed. Selection of an appropriate method for the generation of applicable artificial earthquake accelerograms (AEAs) has been a challenging subject in the time history analysis of the structures in the case of the absence of sufficient recorded accelerogra...

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...

Publication date: 1997